Skip to content

Conversation

@connermanuel
Copy link
Contributor

this PR adjusts the behavior of train_on_inputs.
if train type is SFT, we include this parameter in the TrainingMethod and default it to auto.
if train type is DPO, we default to None and raise if parameter is supplied.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatting only

@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch 2 times, most recently from 8ef8769 to 1ce13d6 Compare May 8, 2025 22:03
)
train_on_inputs = "auto"

if dpo_beta is not None and training_method != "dpo":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other option might be just a warning. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good question! i don't have a strong opinion here, but i am leaning towards the error because non-None value means the user intentionally set it. if they intentionally set it, it might be because they assumed dpo would be active. so we should cancel and let them know that it isn't, rather than silently continuing with an sft job. wdyt @artek0chumak ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an imediate error, not something from FT API or from a job. I think it's fine to leave it us an error

Copy link
Contributor

@artek0chumak artek0chumak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch from 01ab30b to 8331552 Compare May 13, 2025 13:37
@connermanuel connermanuel requested a review from artek0chumak May 13, 2025 13:37
@connermanuel connermanuel force-pushed the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch from 8331552 to 77601a9 Compare May 13, 2025 18:40
@connermanuel connermanuel merged commit c026b7b into main May 13, 2025
10 checks passed
@connermanuel connermanuel deleted the cmanuel/eng-24978-move-train_on_inputs-to-the-parameters-of-sft-training branch May 13, 2025 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants